Working with Holoviews and Geoviews on Barcelona data

In this notebook we calculate the density of Airbnb listings per neighbourhood, defined as the number of listings divided by the number of households, and plot it with GeoViews. GeoViews is built on HoloViews and adds a family of geographic plot types based on the Cartopy library.


In [1]:
%%HTML
<style>
.container{width:75% !important;}
.text_cell_rendered_html{width:20% !important;}
</style>



In [2]:
import pandas as pd 
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from matplotlib.pylab import rcParams
rcParams['figure.figsize'] = 15, 6

In [3]:
import xarray as xr
import numpy as np
import pandas as pd
import holoviews as hv
import geoviews as gv
import geoviews.feature as gf

import cartopy
from cartopy import crs as ccrs

from bokeh.tile_providers import STAMEN_TONER
from bokeh.models import WMTSTileSource

hv.notebook_extension('bokeh')


HoloViewsJS, BokehJS successfully loaded in this cell.

In [4]:
from bokeh.models import (
    GeoJSONDataSource,
    HoverTool,
    LinearColorMapper
)
from bokeh.plotting import figure
from bokeh.palettes import Viridis6
from bokeh.io import output_notebook, show
output_notebook()


Loading BokehJS ...

1. Load and clean the data:

  • Latest scrape of the Barcelona Airbnb listings
  • Official census data: number of families and number of households per neighbourhood

In [5]:
data = pd.read_csv("data/listings/08042017/listings.csv", low_memory=False)
num_hogares = pd.read_csv("data/num_hogares/NumHogaresYFamilias2011.csv", sep=";", thousands='.')
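The census file is semicolon-separated and writes large counts with a dot as the thousands separator. As a minimal sketch of what those `read_csv` options do, the following parses a small made-up sample laid out like the head of the census table. The sample rows are illustrative only, and `decimal=","` is an addition that is not in the original call; it is included here so the dot is unambiguously a thousands separator.

```python
import io

import pandas as pd

# Made-up sample that mimics the layout of the census file (not the real data).
sample = io.StringIO(
    "Distrito;Barrio;NumHogares\n"
    "1;1.el Raval;17.752\n"
    "1;2.el Barri Gòtic;8.209\n"
)

# sep=";"       -> fields are separated by semicolons
# thousands="." -> "17.752" is read as the integer 17752
# decimal=","   -> assumption: decimals, if any, would use a comma
census_sample = pd.read_csv(sample, sep=";", thousands=".", decimal=",")

print(census_sample.dtypes)
print(census_sample.NumHogares.tolist())
```

Checking `dtypes` right after loading is a cheap way to confirm that the count columns were parsed as numbers rather than strings.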

Clean the census data


In [6]:
num_hogares.head()


Out[6]:
   Distrito  Barrio                                   NumHogares  NumFamilias  NumNucleos       .1   .2
0  1         1.el Raval                               17752       10763        11251       NaN  NaN  NaN
1  1         2.el Barri Gòtic                         8209        4313         4295        NaN  NaN  NaN
2  1         3.la Barceloneta                         8164        4932         49          NaN  NaN  NaN
3  1         4.Sant Pere, Santa Caterina i la Ribera  10380       5744         5628        NaN  NaN  NaN
4  2         5.el Fort Pienc                          13166       9109         8966        NaN  NaN  NaN

In [7]:
# Strip leading and trailing whitespace from the column names
num_hogares.columns = num_hogares.columns.str.strip()

# Remove the numeric prefix from names such as "1.el Raval":
# delete every digit and every dot in a single pass
num_hogares.Barrio = num_hogares.Barrio.str.replace(r"[\d.]", "", regex=True)

# Count the number of listings in each neighbourhood
n_airbnbs_barri = data.neighbourhood_cleansed.value_counts()

# Keep only the neighbourhoods that appear in both the listings and the census
n_airbnbs_barri = n_airbnbs_barri[n_airbnbs_barri.index.isin(num_hogares.Barrio)]
num_hogares = num_hogares[num_hogares.Barrio.isin(n_airbnbs_barri.index)]

# Index the census by neighbourhood name and keep only the household counts
num_hogares = num_hogares.set_index("Barrio").NumHogares
#num_hogares.drop(["Can Peguera", "Baró de Viver", "Torre Baró", "Vallbona"], inplace=True)#outliers
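The cell above removes every digit and every dot from the neighbourhood names. A slightly stricter alternative, sketched here on a toy Series, removes only a leading `N.` prefix and leaves the rest of the name untouched. It assumes every census name starts with such a prefix, which is inferred from the `head()` output and not verified against the full file.

```python
import pandas as pd

# Toy names in the same "N.name" form shown by num_hogares.head()
barrios = pd.Series(["1.el Raval", "2.el Barri Gòtic", "10.Sant Antoni"])

# ^\d+\.  -> one or more digits followed by a dot, only at the start of the string
cleaned = barrios.str.replace(r"^\d+\.", "", regex=True).str.strip()

print(cleaned.tolist())
```

Passing `regex=True` explicitly avoids depending on the default, which has changed between pandas versions.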

In [8]:
#calculate the density
density = n_airbnbs_barri/num_hogares
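This division works because both Series are indexed by neighbourhood name, so pandas aligns them by label before dividing. A minimal sketch with made-up counts shows the behaviour, including what happens to a label that is present on one side only:

```python
import pandas as pd

# Made-up counts; the order of the labels differs on purpose
listings = pd.Series({"el Raval": 10, "Gràcia": 5, "Sants": 2})
households = pd.Series({"Gràcia": 100, "el Raval": 200, "Sarrià": 50})

# Division matches values by index label, not by position
toy_density = listings / households

print(toy_density)
```

Labels missing from either Series end up as `NaN`, which is why the notebook restricts both inputs to the shared neighbourhoods before dividing.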

In [9]:
n_airbnbs = pd.DataFrame(n_airbnbs_barri)
n_households = pd.DataFrame(num_hogares)

In [10]:
#Add total number of airbnbs and households to df
n_airbnbs['N_Barri'] =  n_airbnbs.index
n_households['N_Barri']= n_households.index

density = pd.DataFrame({"N_Barri":density.index, "value":density.values})

density = density.merge(n_airbnbs, how='left', on='N_Barri')
density = density.merge(n_households, how='left', on='N_Barri')

density.columns = ['N_Barri','value','n_airbnb','n_households']

density.head()


Out[10]:
   N_Barri           value     n_airbnb  n_households
0  Baró de Viver     0.004494  4         890
1  Can Baró          0.013831  53        3832
2  Can Peguera       0.001013  1         987
3  Canyelles         0.001909  5         2619
4  Ciutat Meridiana  0.002681  10        3730
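The same kind of table can also be assembled without merges, by concatenating the two count Series along the columns and deriving the density from them. This is only a sketch of an alternative, not the notebook's method; the two rows reuse the counts shown in the output above, and the variable names are chosen for illustration.

```python
import pandas as pd

# Two rows taken from the table above
n_airbnb = pd.Series({"Can Baró": 53, "Canyelles": 5}, name="n_airbnb")
n_households = pd.Series({"Can Baró": 3832, "Canyelles": 2619}, name="n_households")

# Align both Series on the neighbourhood name and place them side by side
table = pd.concat([n_airbnb, n_households], axis=1)

# Density = listings per household
table["value"] = table.n_airbnb / table.n_households

# Turn the index into an explicit N_Barri column, as in the merged DataFrame
table = table.rename_axis("N_Barri").reset_index()

print(table)
```

Because the column names come from the Series names, there is no need to rename the columns by position afterwards.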

2. Load and visualize a GeoJSON file

Here we visualize the neighbourhoods from a GeoJSON file using only Bokeh.


In [11]:
barri_json_path = r"data/divisiones_administrativas/barris/barris_geo.json"
with open(barri_json_path, 'r') as f:
    geo_source = GeoJSONDataSource(geojson=f.read())


TOOLS = "pan,wheel_zoom,box_zoom,reset,hover,save"

p = figure(title="Neighbourhoods", tools=TOOLS, x_axis_location=None,
           y_axis_location=None, width=800, height=800)

p.grid.grid_line_color = None

p.patches('xs', 'ys', fill_alpha=0.7, 
          line_color='white', line_width=1, source=geo_source)

hover = p.select_one(HoverTool)
hover.point_policy = "follow_mouse"
hover.tooltips = [("Neighbourhood", "@N_Barri")]

show(p)
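The hover tooltip refers to `@N_Barri`, so each GeoJSON feature needs an `N_Barri` property for the tooltip to show a name. A quick way to check this before plotting is to inspect the file with the standard `json` module. The sketch below uses a tiny hand-written FeatureCollection as a stand-in for `barris_geo.json`; its coordinates and names are made up.

```python
import json

# Hand-written stand-in for the contents of barris_geo.json (illustrative only)
geojson_text = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"N_Barri": "el Raval"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[2.16, 41.37], [2.17, 41.37], [2.17, 41.38], [2.16, 41.37]]],
            },
        },
        {
            "type": "Feature",
            "properties": {},  # this feature lacks the name property
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[2.18, 41.38], [2.19, 41.38], [2.19, 41.39], [2.18, 41.38]]],
            },
        },
    ],
})

features = json.loads(geojson_text)["features"]

# Positions of the features whose properties do not include N_Barri
missing = [
    i for i, feature in enumerate(features)
    if "N_Barri" not in (feature.get("properties") or {})
]

print("features:", len(features), "without N_Barri:", missing)
```

With the real file, `geojson_text` would be the string returned by `f.read()` in the cell above.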


Let's put the data into those shapes

We are going to use shapefiles instead of GeoJSON for the sake of simplicity. Converting between the two formats is straightforward.

But first, let's finish formatting the density data.

Read the shapefiles and initialise the map


In [12]:
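import cartopy.io.shapereader as shpreader  # import the submodule explicitly

shapefile = "data/divisiones_administrativas/barris/shape/barris_geo.shp"
shapes = shpreader.Reader(shapefile)

# Drop rows with missing values before wrapping the table in a HoloViews Dataset
density_hv = hv.Dataset(density.dropna())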
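The shapes and the density table are later joined on the `N_Barri` field, so it helps to know which names fail to match before plotting. The sketch below shows the comparison with plain sets. The list of dictionaries is a hypothetical stand-in for the `attributes` of each shapefile record, and the density values are made up.

```python
import pandas as pd

# Hypothetical stand-in for [record.attributes for record in shapes.records()]
shape_attrs = [
    {"N_Barri": "el Raval"},
    {"N_Barri": "Can Baró"},
    {"N_Barri": "Vallbona"},
]

# Made-up density table with the same key column as the notebook
toy_density = pd.DataFrame({
    "N_Barri": ["el Raval", "Can Baró"],
    "value": [0.10, 0.01],
})

shape_names = {attrs["N_Barri"] for attrs in shape_attrs}
data_names = set(toy_density["N_Barri"])

# Names that will have a shape but no value, and values that will have no shape
only_in_shapes = sorted(shape_names - data_names)
only_in_data = sorted(data_names - shape_names)

print("shapes without data:", only_in_shapes)
print("data without shapes:", only_in_data)
```

Names that appear on only one side usually point to spelling or accent differences between the two sources.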

The magic of holoviews


In [13]:
%%opts Overlay [width=1000 height=1000 xaxis=None yaxis=None] Shape (cmap='Reds') [tools=['hover'] width=1000 height=1000 colorbar=True toolbar='above' xaxis=None yaxis=None]
%%output filename="holoviewsmap"

gv.Shape.from_records(shapes.records(), density_hv, on='N_Barri', value='value',
                      index=['N_Barri','n_airbnb','n_households'], # hack to make them appear in the hover tool
                      crs=ccrs.PlateCarree(), group="Densitat Airbnb Barcelona per nombre d'hogars",
                      drop_missing=False)


/home/kalidus/anaconda3/lib/python3.5/site-packages/bokeh/util/deprecation.py:34: BokehDeprecationWarning: webgl was deprecated in Bokeh 0.12.6 and will be removed, use output_backend instead.
  warn(message)
Out[13]:
